The nearest neighbor graph (NNG) for a set of n objects P in a metric space (e.g., for a set of points in the plane with Euclidean distance) is a directed graph with P being its vertex set and with a directed edge from p to q whenever q is a nearest neighbor of p (i.e., the distance from p to q is no larger than from p to any other object from P).[1]
In many discussions the directions of the edges are ignored and the NNG is defined as an ordinary (undirected) graph. However, the nearest neighbor relation is not a symmetric one, i.e., p from the definition is not necessarily a nearest neighbor for q.
In some discussions, in order to make the nearest neighbor for each object unique, the set P is indexed and in the case of a tie the object with, e.g., the largest index is taken for the nearest neighbor.[2]
The k-nearest neighbor graph (k-NNG) is a graph in which two vertices p and q are connected by an edge, if the distance between p and q is among the k-th smallest distances from p to other objects from P. The NNG is a special case of the k-NNG, namely it is the 1-NNG. k-NNGs obey a separator theorem: they can be partitioned into two subgraphs of at most n(d + 1)/(d + 2) vertices each by the removal of O(k1/dn1 − 1/d) points.[3]
Another special case is the (n − 1)-NNG. This graph is called the farthest neighbor graph (FNG).
In theoretical discussions of algorithms a kind of general position is often assumed, namely, the nearest (k-nearest) neighbor is unique for each object. In implementations of the algorithms it is necessary to bear in mind that this is not always the case.
NNGs for points in the plane as well as in multidimensional spaces find applications, e.g., in data compression, motion planning, and facilities location. In statistical analysis, the nearest-neighbor chain algorithm based on following paths in this graph can be used to find hierarchical clusterings quickly. Nearest neighbor graphs are also a subject of computational geometry.
Contents |
For a set of points on a line, the nearest neighbor of a point is its left or right (or both) neighbor, if they are sorted along the line. Therefore the NNG is a path or a forest of several paths and may be constructed in O(n log n) time by sorting. This estimate is asymptotically optimal for certain models of computation, because the constructed NNG gives the answer to the element uniqueness problem: it is sufficient to check whether the NNG has a zero-length edge.
Unless stated otherwise, it is assumed that the NNGs are digraphs with uniquely defined nearest neighbors as described in the introduction.